346 PART 6 Analyzing Survival Data
Some points to keep in mind:
» If your software outputs a zero-based baseline survival function, you don’t
subtract the average value from the patient’s value. Instead, calculate the v
term as the product of the patient’s predictor value and the regression
coefficient.
» If a predictor is a categorical variable, you have to code the levels as numbers.
If you have a dichotomous variable like pregnancy status, you could code not
pregnant = 0 and pregnant = 1. Then, if 47.2 percent of a sample that includes
only women is pregnant, the average pregnancy status is 0.472. If the patient
is not pregnant, the subtraction in Step 1 is 0 – 0.472, giving –0.472.
If the patient is pregnant, the subtraction is 1 – 0.472, giving 0.528.
Then you carry out all the other steps exactly as described.
» It’s even a little trickier for multivalued categories (such as different clinical
centers) because you have to code each such predictor as a set of indicator
variables (one 0/1 variable per level, typically omitting a reference level).
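
The points above can be sketched in Python. This is only a minimal illustration, not the output of any particular statistics package: the function names v_term and indicator_code, the coefficient value 0.8, and the center labels A, B, and C are all hypothetical. Only the 0.472 average comes from the example in the text.

```python
def v_term(patient, coefficients, averages=None):
    """Sum of coefficient * (patient value - average) over all predictors.

    Pass averages=None when the software reports a zero-based baseline
    survival function, so that no average is subtracted.
    """
    total = 0.0
    for name, coefficient in coefficients.items():
        average = 0.0 if averages is None else averages[name]
        total += coefficient * (patient[name] - average)
    return total


def indicator_code(level, levels, reference):
    """Code one multivalued category as 0/1 indicators, omitting the reference level."""
    return {lv: (1 if lv == level else 0) for lv in levels if lv != reference}


# Hypothetical regression coefficient for pregnancy status (not from the text).
coefficients = {"pregnant": 0.8}
averages = {"pregnant": 0.472}  # 47.2 percent of the sample is pregnant

v_not_pregnant = v_term({"pregnant": 0}, coefficients, averages)  # 0.8 * (0 - 0.472)
v_pregnant = v_term({"pregnant": 1}, coefficients, averages)      # 0.8 * (1 - 0.472)
v_zero_based = v_term({"pregnant": 1}, coefficients)              # nothing subtracted

# Hypothetical clinical centers A, B, and C, with A as the reference level.
center_b = indicator_code("B", ["A", "B", "C"], reference="A")
print(v_not_pregnant, v_pregnant, v_zero_based, center_b)
```

With more than one predictor, the same loop adds one product per predictor, and each indicator variable from a multivalued category is treated as its own 0/1 predictor with its own coefficient.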
Estimating the Required Sample Size
for a Survival Regression
Note: Elsewhere in this chapter, we use the word power in its algebraic sense, such
as in x² is x to the power of 2. But in this section, we use power in its statistical
sense to mean the probability of getting a statistically significant result when
performing a statistical test.
Except for straight-line regression discussed in Chapter 16, sample-size calcula-
tions for regression analysis tend not to be straightforward. If you find software
that will calculate sample-size estimates for survival regression, it often asks for
inputs you don’t have.
Very often, sample-size estimates for studies that use regression methods are
based on simpler analytical methods. We recommend that when you’re planning
a study that will be analyzed using PH regression, you base your sample-size esti-
mate on the simpler log-rank test, described in Chapter 22. The free PS program
handles these calculations very well.
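
To give a feel for what a log-rank-based estimate involves, here is a rough sketch using Schoenfeld's approximation for the number of events needed in a two-group comparison. This is an assumption on our part about a reasonable formula, not necessarily the method that PS or any other program uses. The function names, the hazard ratio of 0.5, and the event probability of 0.5 are hypothetical planning inputs.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80, allocation=0.5):
    """Approximate number of events for a two-sided log-rank test.

    Schoenfeld's approximation:
        events = (z_alpha + z_power)^2 / (p * (1 - p) * ln(HR)^2)
    where p is the fraction of subjects allocated to one group.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    events = (z_alpha + z_power) ** 2 / (
        allocation * (1 - allocation) * math.log(hazard_ratio) ** 2
    )
    return math.ceil(events)


def required_sample_size(hazard_ratio, event_probability, **kwargs):
    """Inflate the event count by the expected overall probability of an event."""
    events = required_events(hazard_ratio, **kwargs)
    return math.ceil(events / event_probability)


# Hypothetical planning inputs: detect a hazard ratio of 0.5 with 80 percent
# power at alpha = 0.05, expecting half of all subjects to have the event.
events_needed = required_events(0.5)
subjects_needed = required_sample_size(0.5, event_probability=0.5)
print(events_needed, subjects_needed)
```

Notice that the calculation is driven by the number of events, not the number of subjects. The fewer subjects you expect to have the event during follow-up, the more subjects you need to enroll.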